Word Length Frequency and Distribution in English: Observations, Theory and Implications for the Construction of Verse Lines
Authors
Abstract
Recent observations in the theory of verse and empirical metrics have suggested that constructing a verse line involves a pattern-matching search through a source text, and that the number of found elements (complete words totaling a specified number of syllables) is given by dividing the total number of words by the mean number of syllables per word in the source text. This paper makes the latter point mathematically explicit, and in the course of the demonstration shows that the word length frequency totals in English output are distributed geometrically (previous researchers reported an adjusted Poisson distribution), and that the sequential distribution is random at the global level, with significant non-randomness in the fine structure. Data from a corpus of just under two million words and a syllable-count lexicon of 71,000 word-forms are reported. The pattern-matching theory is shown to be internally coherent, and it is observed that some of the analytic techniques described here form a satisfactory test for regular (isometric) lineation in a text.
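The counting claim in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's procedure: it assumes synthetic word lengths drawn independently from a geometric distribution on 1, 2, 3, ... (the parameter `p = 0.7` is an arbitrary illustrative value, and the function names `count_matches` and `geometric_lengths` are hypothetical). It counts the word positions from which a run of complete words totals exactly `n` syllables and compares that count with the number of words divided by the mean syllables per word.

```python
import random
from itertools import accumulate


def count_matches(syllables, n):
    """Count word starts from which complete words total exactly n syllables."""
    # Syllable offsets of every word boundary: 0, s1, s1 + s2, ...
    boundaries = [0, *accumulate(syllables)]
    boundary_set = set(boundaries)
    # A match begins at the start of a word (offset b) and must end
    # exactly on another word boundary n syllables later.
    return sum(1 for b in boundaries[:-1] if b + n in boundary_set)


def geometric_lengths(count, p, rng):
    """Draw synthetic word lengths from a geometric distribution on 1, 2, 3, ..."""
    lengths = []
    for _ in range(count):
        k = 1
        while rng.random() >= p:  # continue the word with probability 1 - p
            k += 1
        lengths.append(k)
    return lengths


rng = random.Random(0)                          # fixed seed for repeatability
words = geometric_lengths(100_000, 0.7, rng)    # assumed, synthetic data
mean = sum(words) / len(words)                  # mean syllables per word
predicted = len(words) / mean                   # words / mean syllables per word

for n in (5, 10):
    print(n, count_matches(words, n), round(predicted))
```

Under these assumptions the observed counts should lie close to the predicted value for each `n`, because with independent geometric word lengths every syllable position is a word boundary with the same probability, namely one over the mean word length. Real text need not behave this way in detail, which is consistent with the abstract's remark about non-randomness in the fine structure.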
Journal title: CoRR
Volume: cmp-lg/9808004
Publication date: 1998